Goto

Collaborating Authors

 Kakinada


Constrained Centroid Clustering: A Novel Approach for Compact and Structured Partitioning

arXiv.org Machine Learning

This paper presents Constrained Centroid Clustering (CCC), a method that extends classical centroid-based clustering by enforcing a constraint on the maximum distance between the cluster center and the farthest point in the cluster. Using a Lagrangian formulation, we derive a closed-form solution that maintains interpretability while controlling cluster spread. To evaluate CCC, we conduct experiments on synthetic circular data with radial symmetry and uniform angular distribution. Using ring-wise, sector-wise, and joint entropy as evaluation metrics, we show that CCC achieves more compact clusters by reducing radial spread while preserving angular structure, outperforming standard methods such as K-means and GMM. The proposed approach is suitable for applications requiring structured clustering with spread control, including sensor networks, collaborative robotics, and interpretable pattern analysis.


GraphRank Pro+: Advancing Talent Analytics Through Knowledge Graphs and Sentiment-Enhanced Skill Profiling

arXiv.org Artificial Intelligence

The extraction of information from semi-structured text, such as resumes, has long been a challenge due to the diverse formatting styles and subjective content organization. Conventional solutions rely on specialized logic tailored for specific use cases. However, we propose a revolutionary approach leveraging structured Graphs, Natural Language Processing (NLP), and Deep Learning. By abstracting intricate logic into Graph structures, we transform raw data into a comprehensive Knowledge Graph. This innovative framework enables precise information extraction and sophisticated querying. We systematically construct dictionaries assigning skill weights, paving the way for nuanced talent analysis. Our system not only benefits job recruiters and curriculum designers but also empowers job seekers with targeted query-based filtering and ranking capabilities.


Performance Evaluation of Sentiment Analysis on Text and Emoji Data Using End-to-End, Transfer Learning, Distributed and Explainable AI Models

arXiv.org Artificial Intelligence

Emojis are being frequently used in todays digital world to express from simple to complex thoughts more than ever before. Hence, they are also being used in sentiment analysis and targeted marketing campaigns. In this work, we performed sentiment analysis of Tweets as well as on emoji dataset from the Kaggle. Since tweets are sentences we have used Universal Sentence Encoder (USE) and Sentence Bidirectional Encoder Representations from Transformers (SBERT) end-to-end sentence embedding models to generate the embeddings which are used to train the Standard fully connected Neural Networks (NN), and LSTM NN models. We observe the text classification accuracy was almost the same for both the models around 98 percent. On the contrary, when the validation set was built using emojis that were not present in the training set then the accuracy of both the models reduced drastically to 70 percent. In addition, the models were also trained using the distributed training approach instead of a traditional singlethreaded model for better scalability. Using the distributed training approach, we were able to reduce the run-time by roughly 15% without compromising on accuracy. Finally, as part of explainable AI the Shap algorithm was used to explain the model behaviour and check for model biases for the given feature set.


Information Security and Privacy in the Digital World: Some Selected Topics

arXiv.org Artificial Intelligence

Recent developments in hardware and information technology have enabled the emergence of billions of connected, intelligent devices around the world exchanging information with minimal human involvement. This paradigm, known as the Internet of Things (IoT), is progressing quickly, with an estimated 27 billion devices by 2025 (almost four devices per person) [1, 2]. These smart devices help improve our quality of life, with wearables to monitor health, vehicles that interact with traffic centers and other vehicles to ensure safety, and various home appliances offering comfort. This increase in the number of IoT devices and successful IoT services has generated tremendous data. The International Data Corporation report estimates that by 2025 this data will grow from 4 to 140 zettabytes [3].


From ChatGPT to ThreatGPT: Impact of Generative AI in Cybersecurity and Privacy

arXiv.org Artificial Intelligence

Undoubtedly, the evolution of Generative AI (GenAI) models has been the highlight of digital transformation in the year 2022. As the different GenAI models like ChatGPT and Google Bard continue to foster their complexity and capability, it's critical to understand its consequences from a cybersecurity perspective. Several instances recently have demonstrated the use of GenAI tools in both the defensive and offensive side of cybersecurity, and focusing on the social, ethical and privacy implications this technology possesses. This research paper highlights the limitations, challenges, potential risks, and opportunities of GenAI in the domain of cybersecurity and privacy. The work presents the vulnerabilities of ChatGPT, which can be exploited by malicious users to exfiltrate malicious information bypassing the ethical constraints on the model. This paper demonstrates successful example attacks like Jailbreaks, reverse psychology, and prompt injection attacks on the ChatGPT. The paper also investigates how cyber offenders can use the GenAI tools in developing cyber attacks, and explore the scenarios where ChatGPT can be used by adversaries to create social engineering attacks, phishing attacks, automated hacking, attack payload generation, malware creation, and polymorphic malware. This paper then examines defense techniques and uses GenAI tools to improve security measures, including cyber defense automation, reporting, threat intelligence, secure code generation and detection, attack identification, developing ethical guidelines, incidence response plans, and malware detection. We will also discuss the social, legal, and ethical implications of ChatGPT. In conclusion, the paper highlights open challenges and future directions to make this GenAI secure, safe, trustworthy, and ethical as the community understands its cybersecurity impacts.


Applying Machine Learning to DevOps

@machinelearnbot

The world of using static tooling for packaging, provisioning, deployments, and monitoring, APM and log management will be over. With Docker adoption, the Cloud and API driven approaches and micro-services to deploying applications at a large scale, ensuring high reliability, requires an excellent take. So, it's essential to include creative managing tools for cloud instead of reinventing the wheel every time. With the rise of ML and AI, more DevOps tooling vendors are incorporating intelligence with their offerings for further simplifying the task of engineers. Machine Learning is the practical application of Artificial Intelligence (AI) in the form of a set of programs or algorithms. The aspect of learning relies on training time and data.


An Introduction to LUIS (Language Understanding Intelligent Service) - DZone AI

#artificialintelligence

Language Understanding Intelligent Service (LUIS) enables developers to build smart applications that can understand human language and respond accordingly to user requests. Let's first try to understand why we need LUIS. In a web application, you search for a functionality in the menu. The menu, menu items, screen layout, and navigation vary for each application. Before you use any application, you need to familiarize yourself with the menu items and the navigation. The same functionality may have different names in different applications.


An Extensive Report on Cellular Automata Based Artificial Immune System for Strengthening Automated Protein Prediction

arXiv.org Artificial Intelligence

Artificial Immune System (AIS-MACA) a novel computational intelligence technique is can be used for strengthening the automated protein prediction system with more adaptability and incorporating more parallelism to the system. Most of the existing approaches are sequential which will classify the input into four major classes and these are designed for similar sequences. AIS-MACA is designed to identify ten classes from the sequences that share twilight zone similarity and identity with the training sequences with mixed and hybrid variations. This method also predicts three states (helix, strand, and coil) for the secondary structure. Our comprehensive design considers 10 feature selection methods and 4 classifiers to develop MACA (Multiple Attractor Cellular Automata) based classifiers that are build for each of the ten classes. We have tested the proposed classifier with twilight-zone and 1-high-similarity benchmark datasets with over three dozens of modern competing predictors shows that AIS-MACA provides the best overall accuracy that ranges between 80% and 89.8% depending on the dataset.